Abstract. We present an algorithm to identify the types of supernova spectra, and determine their
redshift and phase. This algorithm, based on the correlation techniques of Tonry & Davis, is
implemented in the SuperNova IDentification code (SNID). It is used by members of the ESSENCE
project to determine whether a noisy spectrum of a high-redshift supernova is indeed of type Ia,
as opposed to, e.g., type Ib/c. Furthermore, by comparing the correlation redshifts obtained using
SNID with those determined from narrow lines in the supernova host galaxy spectrum, we show
that accurate redshifts (with a typical error σ_z < 0.01) can be determined for SNe Ia for which a
spectrum of the host galaxy is unavailable. Last, the phase of an input spectrum is determined with
a typical accuracy of σ_t < 3 days.
INTRODUCTION

Supernovae play a major role in the recent revival of observational cosmology. It is
through observations of type-Ia supernovae (SNe Ia) over a large redshift range that two
teams were able to independently confirm the present accelerated rate of the universal
expansion [1, 2]. This astonishing result has been confirmed in subsequent years at
moderate redshifts [3, 4, 5], but also at higher (z > 1) redshifts where the universal
expansion is in a decelerating phase. Currently, two ongoing projects have the more
ambitious goal to measure the equation-of-state parameter of the "dark energy" that
drives the expansion: the ESSENCE [7] and SNLS [8] projects. The success of these
cosmological experiments depends, amongst other things, on the assurance that the
supernovae in the sample are of the correct type, namely, SNe Ia. Inclusion of supernovae
that are of a different type or exclusion of SNe Ia from the sample leads to biased
cosmological parameters in the former case [9] and increased statistical errors on these
same parameters in the latter. The secure classification of supernovae is a challenge at
all redshifts, however. Even with high signal-to-noise ratio (S/N) spectra, the distinction
between supernovae of different types (or between subtypes within a given type) can
pose problems.

The spectrum of a supernova also contains information on its redshift and phase.
Knowledge of the SN redshift is necessary for the use of SNe Ia as distance indicators
(though see [10] for redshift-independent distances), and is usually determined via
narrow lines in the spectrum of the host galaxy. When such a spectrum is unavailable,
however, one has to rely on comparison with SN template spectra for determining the
redshift. The SN phase is usually determined using a well-sampled lightcurve, but a
single spectrum can also provide a relatively accurate estimate. Moreover, comparison
of spectral and lightcurve phases of high-redshift supernovae can be used to test the
general relativistic prediction of time dilation [11, 12].

We have developed a tool (SuperNova IDentification; SNID) to determine the type,
redshift and phase of a supernova, using a single spectrum. The algorithm is based on
the correlation techniques of Tonry & Davis [13], and relies on the comparison of an
input spectrum with a database of high-S/N template spectra. Our database presently
comprises 796 spectra of 64 SNe Ia, 172 spectra of 8 SNe Ib, 116 spectra of 9 SNe Ic,
353 spectra of 10 SNe II, as well as spectra of galaxies, AGNs, and variable stars.
The supernova spectra cover a broad range of phases, and span a sufficient restframe
wavelength range (λ_min < 4000 Å; λ_max > 6500 Å) to include all the identifying features
of SN spectra. Most of these spectra are publicly available through the SUSPECT and
CfA SN spectral archives.

We briefly describe the cross-correlation technique in the next section, and then test 
the accuracy of correlation redshifts and phases using SNID. Last, we tackle the issue 
of supernova classification by focusing on two examples relevant to SN searches at high 
redshifts. 



CROSS-CORRELATION TECHNIQUES 

The cross-correlation method presented here is extensively discussed by Tonry and
Davis [13], where it is exclusively applied to galaxy spectra. Nonetheless, the formalism
is easy to adapt to supernova spectra. The correlation technique is in principle fairly
straightforward: a supernova spectrum, s(n), whose redshift, z, is to be found is
cross-correlated with a template spectrum, t(n) (of known type and phase) at zero redshift. We
want to determine the (1 + z) wavelength scaling that maximizes the cross-correlation
c(n) = s(n) ⋆ t(n). In practice, it is convenient to bin the spectra linearly with ln λ, where
λ denotes the wavelength. Scaling the wavelength axis of t(n) by a factor (1 + z) is then
equivalent to adding a ln(1 + z) shift to the logarithmic wavelength axis of t(n), i.e. a
(velocity) redshift corresponds to a uniform linear shift.
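
The equivalence between a (1 + z) scaling and a uniform shift on a ln λ axis can be checked
numerically. The sketch below is an illustration only: the grid limits, the number of bins, and
the helper names (log_wavelength_grid, redshift_to_bin_shift, bin_shift_to_redshift) are
assumptions made for this example and are not SNID parameters.

```python
import math

def log_wavelength_grid(lam_min, lam_max, n_bins):
    """Return n_bins wavelengths spaced uniformly in ln(lambda), and the bin size."""
    dlnlam = math.log(lam_max / lam_min) / n_bins
    grid = [lam_min * math.exp(i * dlnlam) for i in range(n_bins)]
    return grid, dlnlam

def redshift_to_bin_shift(z, dlnlam):
    """Shift, in ln(lambda) bins, produced by a redshift z."""
    return math.log(1.0 + z) / dlnlam

def bin_shift_to_redshift(shift, dlnlam):
    """Inverse mapping: redshift corresponding to a shift of `shift` bins."""
    return math.exp(shift * dlnlam) - 1.0

# Hypothetical grid: 1024 bins between 2500 and 10000 Angstroms.
grid, dlnlam = log_wavelength_grid(2500.0, 10000.0, 1024)
z = 0.5
shift = redshift_to_bin_shift(z, dlnlam)

# Redshifting each grid wavelength by (1 + z) moves it by the same number of
# bins, whatever its starting wavelength.
offsets = [math.log(lam * (1.0 + z) / grid[0]) / dlnlam - i
           for i, lam in enumerate(grid)]
```

Every entry of offsets equals shift to within rounding error, which is why the redshift can be
read off from the position of the correlation peak.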

We show the result of mapping an input supernova spectrum onto a logarithmic
wavelength axis in Fig. 1: the input spectrum is shown in panel (a) and its ln λ binned
version in panel (b). The next step in preparing the spectra for correlation analysis is
pseudo-continuum subtraction ([13]; Fig. 1, panel (c)). The purpose is to remove any
intrinsic color information in the input and template spectra, so that the correlation is
not affected by reddening uncertainties or instrumental distortions and relies only
on the relative shape and strength of spectral features in the input and template spectra.
The final step is the application of a bandpass filter (Fig. 1, panel (d)). The goal is to
remove low-frequency residuals left over from the pseudo-continuum subtraction and
high-frequency noise components.
FIGURE 1. Spectrum pre-processing. The input spectrum (a) is binned on a logarithmic wavelength
scale (b). A pseudo-continuum is subtracted (c) and the resulting spectrum is bandpass filtered (d).
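
The last two pre-processing steps can be sketched as follows. This is a simplified stand-in, not
the SNID implementation: the pseudo-continuum is approximated here by a running mean and the
bandpass filter by a sharp cut in Fourier space, and the function names, window size and
wavenumber limits are all assumptions chosen for illustration.

```python
import cmath
import math

def subtract_pseudo_continuum(flux, half_width):
    """Subtract a running mean, a crude stand-in for a smooth continuum fit."""
    n = len(flux)
    out = []
    for i in range(n):
        # Keep the window symmetric about bin i, shrinking it near the edges.
        w = min(half_width, i, n - 1 - i)
        window = flux[i - w:i + w + 1]
        out.append(flux[i] - sum(window) / len(window))
    return out

def bandpass(signal, k_low, k_high):
    """Keep only Fourier wavenumbers k_low <= k <= k_high (naive O(n^2) DFT)."""
    n = len(signal)
    spec = [sum(signal[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
            for k in range(n)]
    for k in range(n):
        kk = min(k, n - k)  # fold negative frequencies onto positive ones
        if kk < k_low or kk > k_high:
            spec[k] = 0.0
    return [(sum(spec[k] * cmath.exp(2j * math.pi * k * j / n) for k in range(n)) / n).real
            for j in range(n)]

# Toy spectrum on a ln(lambda) grid: sloped continuum plus one broad feature.
n = 64
spectrum = [10.0 + 0.1 * i + math.sin(2 * math.pi * 8 * i / n) for i in range(n)]
flattened = subtract_pseudo_continuum(spectrum, 8)
filtered = bandpass(flattened, 3, 20)
```

A tapered filter edge would be a gentler choice than the sharp cut used here, which is kept only
for brevity.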



Tonry and Davis [13] introduce a parameter, r, to quantify the significance of a
correlation peak in c(n). It is defined as the ratio of the height h of the peak to the
root-mean-square (RMS), σ_a, of the antisymmetric component of c(n) about the correlation
redshift (see Fig. 2). A perfect correlation will have a peak with h = 1 at the exact
redshift, and c(n) will be symmetric about this redshift, i.e. σ_a = 0 and r → ∞.
Conversely, r will be small (r < 3) for a spurious correlation peak, and large (r > 8)
for a significant peak, since h will be close to 1 and σ_a will be small. The width of the
correlation peak, w, is used to formally evaluate the redshift error (not discussed here).
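
A minimal sketch of how r could be computed from a sampled, normalized correlation function is
given below. It assumes a periodic c(n), takes the antisymmetric component about the peak as
[c(p + k) − c(p − k)]/2, and uses the normalization quoted in the caption of Fig. 2 (height
divided by twice the RMS). The function name r_value and the norm argument are hypothetical;
norm is exposed so that other conventions can be substituted.

```python
import math

def r_value(corr, norm=2.0):
    """Return (peak index, peak height h, r) for a periodic correlation function."""
    n = len(corr)
    peak = max(range(n), key=lambda i: corr[i])
    h = corr[peak]
    # Antisymmetric component of corr about the peak position.
    anti = [0.5 * (corr[(peak + k) % n] - corr[(peak - k) % n]) for k in range(n)]
    sigma_a = math.sqrt(sum(a * a for a in anti) / n)
    if sigma_a == 0.0:
        return peak, h, math.inf  # perfectly symmetric peak
    return peak, h, h / (norm * sigma_a)

# Toy example: a symmetric peak at bin 40, then the same peak plus an
# antisymmetric perturbation of amplitude 0.05.
n = 128
def circ_dist(i, centre):
    d = abs(i - centre)
    return min(d, n - d)

symmetric = [math.exp(-0.5 * (circ_dist(i, 40) / 3.0) ** 2) for i in range(n)]
perturbed = [c + 0.05 * math.sin(2 * math.pi * (i - 40) / n)
             for i, c in enumerate(symmetric)]

print(r_value(symmetric)[2])   # symmetric peak: r is infinite
print(r_value(perturbed)[2])   # finite r once an antisymmetric part is present
```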

The r-value is further weighted by the overlap in ln λ space (at the correlation redshift)
between the input spectrum and each of the template spectra used in the correlation.
The template spectra are trimmed to match the wavelength range of the input spectrum
at the redshift corresponding to the correlation peak. The overlap value, lap, conveys
important absolute information about the quality of the correlation, complementary
to the correlation parameter r. We usually discard correlation redshifts that have an
associated lap < lap_min = 0.40 or a combined rlap = r × lap < rlap_min = 5.
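
The selection on lap and rlap can be written down compactly. In the sketch below, lap is assumed
to be the extent in ln λ of the wavelength range shared by the input spectrum and the redshifted
template; the function names and the example wavelength ranges are illustrative assumptions, and
only the two thresholds come from the text.

```python
import math

LAP_MIN = 0.40   # minimum overlap in ln(lambda), from the text
RLAP_MIN = 5.0   # minimum r x lap, from the text

def overlap_lnlambda(input_range, template_rest_range, z):
    """ln(lambda) extent shared by the input and the template redshifted to z."""
    lo = max(input_range[0], template_rest_range[0] * (1.0 + z))
    hi = min(input_range[1], template_rest_range[1] * (1.0 + z))
    return math.log(hi / lo) if hi > lo else 0.0

def is_good_correlation(r, lap):
    """Keep a correlation only if both the lap and rlap thresholds are met."""
    return lap >= LAP_MIN and r * lap >= RLAP_MIN

# Hypothetical case: input spectrum covering 4000-9000 A (observed frame),
# template covering 3000-7000 A (rest frame), correlation peak at z = 0.5.
lap = overlap_lnlambda((4000.0, 9000.0), (3000.0, 7000.0), 0.5)
print(round(lap, 3), is_good_correlation(10.0, lap))
```

In this example the shared range is 4500-9000 Å, so lap = ln 2.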



REDSHIFT AND PHASE DETERMINATION 



We use a simple simulation to test the accuracy of SNID in determining the redshift
and phase of a supernova spectrum. Here we only consider "normal" type-Ia supernovae
since they are the most represented in our spectral database, though the conclusions
announced in this section are qualitatively valid for all other supernova types.

FIGURE 2. The correlation r-value is defined as the ratio of the height, h, of the highest peak in the
normalized correlation function (solid line) to twice the RMS, σ_a, of its antisymmetric component (dotted
line). [Plot: normalized correlation amplitude vs. redshift, z; the peak of height h lies at z_SNID.]

In this simulation, each normal SN Ia spectrum in the database is correlated with all
other normal SN Ia spectra using SNID, after having taken care to temporarily remove
all spectra corresponding to the input supernova from the database. The input spectrum
is first redshifted at random in the interval 0.1 < z < 0.7. We then add noise (both random
Poisson noise and sky background) to reproduce the range of typical signal-to-noise ratio
of SN spectra at the simulation redshifts, when observed with 8-10m-class telescopes
(e.g., VLT, Keck, Gemini) used in cosmological SN Ia surveys.
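
The degradation step of such a simulation might look like the sketch below. It is not the code
used for the results in this section: the noise model is a Gaussian approximation to Poisson
statistics with variance equal to the object counts plus a constant sky level, and the function
names, count levels and sky level are assumptions.

```python
import math
import random

def draw_redshift(rng, z_min=0.1, z_max=0.7):
    """Draw a simulation redshift uniformly in the interval quoted in the text."""
    return rng.uniform(z_min, z_max)

def simulate_observation(wave_rest, counts, z, sky_counts, rng):
    """Redshift a rest-frame spectrum and add photon noise from object and sky.

    Gaussian approximation to Poisson noise: variance = object counts + sky counts.
    """
    wave_obs = [w * (1.0 + z) for w in wave_rest]
    noisy = []
    for c in counts:
        sigma = math.sqrt(max(c, 0.0) + sky_counts)
        noisy.append(c + rng.gauss(0.0, sigma))
    return wave_obs, noisy

rng = random.Random(42)          # fixed seed so the sketch is reproducible
z = draw_redshift(rng)
wave_obs, noisy = simulate_observation(
    [4000.0, 5000.0, 6000.0], [100.0, 150.0, 120.0], z, sky_counts=300.0, rng=rng)
```

Raising sky_counts lowers the signal-to-noise ratio, which is one simple way to span the S/N
range of interest.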

We show the distribution of redshift residuals, Δz, vs. the rlap parameter in the upper
panel of Fig. 3 for input spectra satisfying 0.3 < z < 0.5; −5 < t [days] < +15 (t is
the SN phase); 2 < S/N (per Å) < 10. The residuals are shown as a two-dimensional
histogram, with a grayscale scheme reflecting the number of points in a given [Δz, rlap]
2D bin (darker for more points). We only show correlations for which the overlap
between input and template spectra lap > 0.40. For good correlations (rlap > 5), the
distribution of redshift residuals is a Gaussian centered at Δz = 0. In the lower panel,
we show the standard deviation of redshift residuals, σ_z, in rlap bins of size unity. For
rlap > 5, we have a typical error in redshift of order σ_z < 0.01.
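
The binned dispersion described above can be computed as sketched below. The function name and
the toy input values are hypothetical and are there only to show the mechanics; they are not
simulation results.

```python
import math
from collections import defaultdict

def sigma_by_rlap_bin(rlaps, residuals, bin_size=1.0):
    """Sample standard deviation of redshift residuals in rlap bins.

    Returns {lower edge of bin: sigma_z}; bins with fewer than two points are skipped.
    """
    bins = defaultdict(list)
    for rlap, dz in zip(rlaps, residuals):
        bins[int(rlap // bin_size)].append(dz)
    sigmas = {}
    for b, vals in sorted(bins.items()):
        if len(vals) < 2:
            continue
        mean = sum(vals) / len(vals)
        var = sum((v - mean) ** 2 for v in vals) / (len(vals) - 1)
        sigmas[b * bin_size] = math.sqrt(var)
    return sigmas

# Toy residuals (made up for illustration).
toy_rlap = [5.2, 5.8, 5.5, 6.1]
toy_dz = [0.01, -0.01, 0.0, 0.02]
sigmas = sigma_by_rlap_bin(toy_rlap, toy_dz)
```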

For poor correlations (rlap < 3) there is a concentration of points around Δz ≈ −0.01.
This is due to the SN Ia template phase distribution in our database: many input spectra
at post-maximum phases (t < +10 days) are attracted to higher phases (> +10 days),
where the position of SN spectral features has shifted redward in wavelength due to the
expansion of the supernova envelope. The template needs to be shifted less in ln λ space
to match the redshift of the input spectrum, which leads to an under-estimation of the
redshift (by ~0.01) corresponding to a combination of the typical velocity shift in SN Ia
absorption features from maximum to ~10 days past maximum, and the spread of these
velocities at a given phase for different supernovae (~3000 km s^-1; see [14, 15]).

FIGURE 3. Upper panel: Redshift residuals vs. rlap. Lower panel: Standard deviation of redshift
residuals in rlap bins of size unity. For rlap > 5, σ_z < 0.01.

This covariance between redshift and phase (over-estimating the phase leads to
under-estimating the redshift, and vice versa) suggests that priors on one parameter should
improve the accuracy of the other. Fig. 4 shows the effect on the distribution of phase
residuals (for rlap > 5) of adding a flat ±0.01 prior on redshift. As expected, a prior on
redshift slightly improves the phase determination (σ_t = 3.4 days to σ_t = 2.9 days). A
flat ±3-day prior also improves the redshift determination (σ_z = 0.006 to σ_z = 0.004,
not shown here). In practice, the prior on redshift generally comes from a spectrum of
the SN host galaxy, and one can impose a prior on phase if a well-sampled lightcurve
of the supernova (i.e. one for which the date of maximum light is easily determined) is
available.
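
One simple way to apply a flat redshift prior is to discard template matches whose correlation
redshift falls outside the allowed window before estimating the phase. The sketch below makes
two assumptions that are not taken from the text: the phase estimate is the median phase of the
retained templates, and each match is stored as a dictionary with keys "z", "phase" and "rlap".
The toy matches are invented for illustration.

```python
from statistics import median

def estimate_phase(matches, rlap_min=5.0, z_prior=None, z_half_width=0.01):
    """Median template phase among good matches, optionally within a flat z prior."""
    kept = [m for m in matches
            if m["rlap"] >= rlap_min
            and (z_prior is None or abs(m["z"] - z_prior) <= z_half_width)]
    if not kept:
        return None
    return median(m["phase"] for m in kept)

# Toy template matches (hypothetical values).
toy_matches = [
    {"z": 0.500, "phase": 2.0, "rlap": 8.0},
    {"z": 0.480, "phase": 12.0, "rlap": 7.0},   # later phase, lower redshift
    {"z": 0.505, "phase": 4.0, "rlap": 6.0},
    {"z": 0.500, "phase": 0.0, "rlap": 3.0},    # poor correlation, ignored
]
phase_no_prior = estimate_phase(toy_matches)
phase_with_prior = estimate_phase(toy_matches, z_prior=0.5)
```

In this toy case the prior removes the late-phase, low-redshift match, which is the kind of
degenerate solution the covariance between redshift and phase produces.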

We test the accuracy of correlation redshifts using SNID by comparing them with those
obtained from narrow emission/absorption lines in the host galaxy spectrum. We have
selected high-redshift SN Ia spectra taken by members of the ESSENCE team [16]
for which a redshift of the host galaxy could be obtained. This amounts to 47
SN Ia spectra in the redshift range 0.164 < z < 0.781. The result of this comparison
is shown in Fig. 5. The upper panel is a plot of the supernova redshift determined
via cross-correlation using SNID, vs. that determined from narrow lines in the host
galaxy spectrum. The agreement is excellent, with a dispersion about the one-to-one
correspondence of σ_z ≈ 0.006 over the whole redshift range. This is in good agreement with
the expected redshift residual found from simulations. The lower panel shows a plot of
the redshift residuals as a function of the galaxy redshift.



TYPE DETERMINATION 



The previous results are only valid if we assume we know the type of the input supernova
spectrum, in this case a "normal" SN Ia. Although SNID is tuned to determining SN
redshifts, we investigate its potential in assigning a probability to the input spectrum
being of a certain type. We focus on two distinct examples, particularly relevant to
ongoing high-redshift SN Ia searches: the distinction between 1991T-like SNe Ia and
other type-Ia supernovae, and the increasing difficulty of distinguishing between type-Ic
supernovae and SNe Ia at high redshifts.

FIGURE 4. Effect of a redshift prior on the phase determination. With no redshift prior (right), σ_t = 3.4
days. With a flat ±0.01 redshift prior, σ_t = 2.9 days.

FIGURE 5. Upper panel: Comparison of redshifts determined from cross-correlations with SN Ia
templates using SNID (z_SNID) and from narrow lines in the host galaxy spectrum (z_GAL). The dispersion
about the one-to-one correspondence is σ_z ≈ 0.006 over the redshift range 0.164 < z < 0.781. Lower
panel: Redshift residuals vs. z_GAL. The data are from the ESSENCE project [16, 17].
[Plot: 47 SN Ia spectra; horizontal axis: galaxy redshift, z_GAL.]

It can be a challenge to distinguish the subtypes of SNe Ia from one another (Fig. 6,
right). 1991T-like SNe Ia have a peak luminosity at the bright end of the
SN Ia distribution, and although their lightcurves still obey the Phillips relation, it
is useful to have an independent confirmation of their high intrinsic luminosity via their
spectra. Spectra of 1991T-like SNe Ia are characterized by the near-absence of Ca II and
Si II lines in the early-time spectra, and prominent high-excitation features of Fe III, not
found in "normal" SNe Ia. The Si II, S II, and Ca II features develop during the
post-maximum phases, and by ~2 weeks past maximum the spectra of "1991T-like" objects
are similar to those of "normal" SNe Ia.

FIGURE 6. Identification of a 1991T-like SN Ia at z = 0.5, around maximum light (right) and past
maximum light (middle). The curves correspond to the fraction of templates of a given type greater than
a given rlap value: 1991T-like SNe Ia (solid); other SNe Ia (dashed); other SN types (dotted). The left
panel shows representative maximum-light spectra of the various SN Ia subtypes, as observed with a
typical optical spectrograph at z = 0.5. Note that the relative differences in pseudo-continuum shapes have
no impact on the SNID results.

In Fig. 6 (left) we illustrate the ability of SNID to identify 1991T-like SNe Ia around
maximum light (i.e. when the spectroscopic differences with "normal" SNe Ia are
most apparent) at z = 0.5. We show the fraction of templates in the SNID database
that correlate with the input spectrum, as a function of the rlap parameter: 1991T-like
SNe Ia (solid line); other SNe Ia (dashed line); supernovae of other types (dotted line).
For rlap > 10, the fraction of 1991T-like templates dominates over the other SN Ia
subtypes. The confusion with other SN types is always low, and nonexistent for rlap > 5.
At ~1-2 weeks past maximum light, however, the distinction is more difficult to make
(as expected) with roughly a 50% probability to recover another SN Ia subtype (Fig. 6,
middle).
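
The curves described above can be built by counting, for each rlap cutoff, how the templates
that exceed the cutoff are distributed among types. The sketch below assumes that the fractions
are normalized to the number of templates above the cutoff; the function name, the type labels
and the toy matches are hypothetical.

```python
from collections import Counter

def type_fractions(matches, rlap_cut):
    """Fraction of each type among template matches with rlap > rlap_cut.

    `matches` is a list of (type label, rlap) pairs.
    """
    kept = [sn_type for sn_type, rlap in matches if rlap > rlap_cut]
    if not kept:
        return {}
    counts = Counter(kept)
    return {sn_type: count / len(kept) for sn_type, count in counts.items()}

# Toy matches for a single input spectrum (made up for illustration).
toy = [("Ia-91T", 12.0), ("Ia-91T", 11.0), ("Ia-norm", 10.5),
       ("Ia-norm", 7.0), ("Ic", 4.0)]
for cut in (5, 10):
    print(cut, type_fractions(toy, cut))
```

Evaluating type_fractions over a range of cutoffs gives one curve per type, and the type whose
fraction dominates at high rlap is the natural classification.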

The mis-identification of supernovae of other types as SNe Ia is a major concern for
ongoing high-redshift SN Ia searches (Fig. 7, right). Including even a small fraction of
non-Ia supernovae in a sample will lead to a mis-calibration of the absolute magnitudes of
these objects, and to biases in the derived cosmological parameters [9]. A particular
concern is the contamination of high-z SN Ia samples with type-Ic supernovae. At
redshifts z > 0.4, the defining Si II λ6355 absorption feature of SNe Ia (also present,
though somewhat weaker, in SNe Ic) is redshifted out of the range of most optical
spectrographs, and one has to rely on spectral features blueward of this to determine the
supernova type. Some of these features, such as the Ca II H&K λλ3934, 3968 doublet,
are common to both SNe Ia and SNe Ic. Other features characteristic of SN Ia spectra
around maximum light (e.g. S II λλ5454, 5640) are generally weak and can be difficult
to detect in low-S/N spectra. One has to invoke external constraints, such as the SN color
evolution, lightcurve shape, host galaxy morphology (only SNe Ia occur in early-type
hosts; [19]), or the expected apparent peak magnitude: type-Ic supernovae at maximum
light are often > 1 mag fainter than SNe Ia, and hence are not expected to pollute
magnitude-limited samples of SNe Ia at high redshift (although Clocchiatti et al.
have reported on a type-Ic SN with a similar absolute magnitude to normal SNe Ia).

FIGURE 7. Identification of a normal SN Ic around maximum light at z = 0.5, with no prior on redshift
or phase (right), and with a flat ±0.01 prior on redshift and a ±3-day prior on the phase (middle). The
curves correspond to the fraction of templates of a given type greater than a given rlap value: normal
SNe Ic (solid); SNe Ia (dashed); other SN types (dotted). The left panel shows representative
maximum-light spectra of SNe Ia and SNe Ic, as observed with a typical optical spectrograph at z = 0.5. Note that
the relative differences in pseudo-continuum shapes have no impact on the SNID results.

In Fig. 7 (left) we illustrate the ability of SNID to identify SNe Ic around maximum
light (−5 < t [days] < +5) at z = 0.5. We show the fraction of templates in the SNID
database that correlate with the input spectrum, as a function of the rlap parameter:
"normal" SNe Ic (solid line); SNe Ia (dashed line); supernovae of other types (dotted
line). For "good" correlations (rlap > 5), the input spectrum almost always correlates
with supernovae of other types. With an additional flat ±0.01 prior on redshift and a
flat ±3-day prior on the phase, the fraction of SN Ic templates dominates for rlap >
5 (Fig. 7, middle). In the absence of such priors, one needs to invoke non-spectral
constraints on the type to identify potential SN Ic contaminants in high-redshift SN Ia
samples.

CONCLUSION AND FUTURE WORK 

We have presented an algorithm, based on the correlation techniques of Tonry & Davis
[13], which can be used to determine the redshift and phase of a supernova spectrum and
place constraints on its type. We develop a diagnostic, the rlap parameter, to quantify the
quality of a given correlation between the input and a template spectrum. This parameter
is simply the product of the Tonry & Davis r-value and the overlap, lap, in ln λ space
between the input and template spectrum at the correlation redshift. We show, based
on simulations, that for rlap > 5, the typical error on redshift and phase is σ_z < 0.01
and σ_t < 3 days, respectively. The former accuracy on redshift is confirmed through a
comparison of correlation redshifts with host-galaxy redshifts (determined from narrow
lines in the spectrum) out to redshifts z < 0.8.

We present first results of an impartial and effective spectroscopic classification of 
supernovae, based on the cumulative fraction of correlations exceeding a certain rlap 
cutoff. We illustrate this through two examples, relevant to ongoing SN la searches at 
high redshift: we are able to distinguish 1991T-like SNe la from other SNe la at z = 0.5, 
if the input spectrum is within five days from maximum light; we identify a type-Ic 
supernova as such at z = 0.5, but only when an additional prior on redshift and phase is 
applied. These examples illustrate both the success and the limitations of such an automated
classification scheme, and highlight the complementarity between spectroscopic and 
photometric observations in determining the supernova type. 

The current version of SNID will soon be made available to the community, and
we plan to set up a web-based interface for instantaneous supernova typing (and
redshift/phase determination). Future versions of SNID will include a wavelength-weighted
lap parameter, an explicit treatment of the covariance between redshift and phase, and
a Bayesian approach to type determination, as currently used for photometric
classification of supernovae [21, 22]. Moreover, more spectral templates are continuously being
included in the SNID database, which directly impacts the ability of SNID to securely
identify input spectra.